Accelerated network traffic sampling using an accelerated line card

ABSTRACT

A method and system of accelerating monitoring of network traffic. The method may include receiving, at a network chip of a network device, a network traffic data unit; capturing, by the network chip, the network traffic data unit based on a traffic sampling rate; adding, by the network chip, a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; sending the sampled network traffic data unit from the network chip to a sampling engine; receiving, from the sampling engine, a flow datagram that includes a network traffic data unit portion and a flow datagram header; generating a flow network data traffic unit that includes the flow datagram; and transmitting the flow network data traffic unit towards a collector.

BACKGROUND

Networks of interconnected devices (e.g., computer networks) are often monitored to, for example, ascertain characteristics of network traffic flow of the network. Such monitoring may be implemented via sampling some portion of the network traffic (e.g., packets, frames, etc.) transmitted into, out of, or within the network to ascertain various items of information related to the network traffic. However, such processing may require that one or more processors of a network device in the network perform at least a portion of the activities required for sampling functionality. Such activities may place a workload burden on the processors, which may affect network device performance.

SUMMARY

In general, in one aspect, the invention relates to a method of accelerating monitoring of network traffic. In one or more embodiments of the invention, the method includes receiving, at a network chip of a network device, a network traffic data unit; capturing, by the network chip, the network traffic data unit based on a traffic sampling rate; adding, by the network chip, a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; sending the sampled network traffic data unit from the network chip to a sampling engine; receiving, from the sampling engine, a flow datagram that includes a network traffic data unit portion and a flow datagram header; generating a flow network data traffic unit that includes the flow datagram; and transmitting the flow network data traffic unit towards a collector.

In general, in one aspect, the invention relates to a non-transitory computer readable medium including instructions that, when executed by a processor, configure a network device to perform a method of accelerating monitoring of network traffic. In one or more embodiments of the invention, the method includes receiving, at a network chip of a network device, a network traffic data unit; capturing, by the network chip, the network traffic data unit based on a traffic sampling rate; adding, by the network chip, a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; sending the sampled network traffic data unit from the network chip to a sampling engine; receiving, from the sampling engine, a flow datagram that includes a network traffic data unit portion and a flow datagram header; generating a flow network data traffic unit that includes the flow datagram; and transmitting the flow network data traffic unit towards a collector.

In general, in one aspect, the invention relates to a method of accelerating monitoring of network traffic. In one or more embodiments of the invention, the method includes receiving, at a sampling engine and from a network chip, a sampled network traffic data unit comprising a sampling header and a network traffic data unit; processing the sampled network traffic data unit to obtain sample information; truncating the network traffic data unit to obtain a network traffic data unit portion; generating a flow sample header that includes the sample information; storing, in storage of the sampling engine, a flow sample that includes the flow sample header and the network traffic data unit portion; constructing a flow datagram that includes the flow sample and a plurality of other flow samples; sending the flow datagram to the network chip; and clearing the flow sample and the plurality of other flow samples from the storage of the sampling engine.

In general, in one aspect, the invention relates to a non-transitory computer readable medium including instructions that, when executed by a processor, configure a network device to perform a method of accelerating monitoring of network traffic. In one or more embodiments of the invention, the method includes receiving, at a sampling engine and from a network chip, a sampled network traffic data unit that includes a sampling header and a network traffic data unit; processing the sampled network traffic data unit to obtain sample information; truncating the network traffic data unit to obtain a network traffic data unit portion; generating a flow sample header that includes the sample information; storing, in storage of the sampling engine, a flow sample that includes the flow sample header and the network traffic data unit portion; constructing a flow datagram that includes the flow sample and a plurality of other flow samples; sending the flow datagram to the network chip; and clearing the flow sample and the plurality of other flow samples from the storage of the sampling engine.

In general, in one aspect, the invention relates to a system for accelerating monitoring of network traffic. In one or more embodiments of the invention, the system includes a network chip configured to receive a network traffic data unit; select the network traffic data unit based on a traffic sampling rate; add a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; send the sampled network traffic data unit from the network chip to a sampling engine; receive, from the sampling engine, a flow datagram that includes a network traffic data unit portion; generate a flow network data traffic unit that includes the flow datagram; and send the flow network data traffic unit to a collector. In one or more embodiments of the invention, the system also includes the sampling engine operatively connected to the network chip, including storage, and configured to receive, at the sampling engine and from the network chip, a sampled network traffic data unit that includes the sampling header and the network traffic data unit; process the sampled network traffic data unit to obtain sample information and the network traffic data unit; truncate the network traffic data unit to obtain a network traffic data unit portion; generate a flow sample header comprising the sample information; store, in storage of the sampling engine, a flow sample that includes the flow sample header and the network traffic data unit portion; construct the flow datagram including the flow sample and a plurality of other flow samples; send the flow datagram to the network chip; and clear the flow sample and the plurality of other flow samples from the storage of the sampling engine.

Other aspects of the invention will be apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a system in accordance with one or more embodiments of the invention.

FIG. 2 shows a system in accordance with one or more embodiments of the invention.

FIG. 3 shows a system in accordance with one or more embodiments of the invention.

FIG. 4A shows an exemplary sampled network traffic data unit structure in accordance with one or more embodiments of the invention.

FIG. 4B shows an exemplary flow sample structure in accordance with one or more embodiments of the invention.

FIG. 4C shows an exemplary flow datagram structure in accordance with one or more embodiments of the invention.

FIG. 4D shows an exemplary flow network traffic data unit structure in accordance with one or more embodiments of the invention.

FIG. 5 shows a flowchart in accordance with one or more embodiments of the invention.

FIG. 6 shows a flowchart in accordance with one or more embodiments of the invention.

FIG. 7 shows a flowchart in accordance with one or more embodiments of the invention.

FIG. 8 shows a flowchart in accordance with one or more embodiments of the invention.

FIG. 9A shows an example in accordance with one or more embodiments of the invention.

FIG. 9B shows an example in accordance with one or more embodiments of the invention.

FIG. 9C shows an example in accordance with one or more embodiments of the invention.

FIG. 9D shows an example in accordance with one or more embodiments of the invention.

DETAILED DESCRIPTION

Specific embodiments will now be described with reference to the accompanying figures. In the following description, numerous details are set forth as examples of the invention. It will be understood by those skilled in the art, and having the benefit of this Detailed Description, that one or more embodiments of the present invention may be practiced without these specific details and that numerous variations or modifications may be possible without departing from the scope of the invention. Certain details known to those of ordinary skill in the art may be omitted to avoid obscuring the description.

In the following description of the figures, any component described with regard to a figure, in various embodiments of the invention, may be equivalent to one or more like-named components described with regard to any other figure. For brevity, descriptions of these components will not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments of the invention, any description of the components of a figure is to be interpreted as an optional embodiment, which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.

Throughout the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as by the use of the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.

In general, embodiments of the invention relate to a system, method, and/or non-transitory computer readable medium for accelerating network traffic (i.e., flow) sampling. Specifically, in one or more embodiments of the invention, network traffic may be received by a network chip of a network device. A portion of the network traffic may be selected as sampled network traffic, mirrored (e.g., copied), and sent to a sampling engine. In one or more embodiments of the invention, the sampling engine performs some additional processing to obtain information related to sampled network traffic data units, aggregates the information related to some number of such samples, and sends the data to a network chip as a flow datagram. In one or more embodiments of the invention, the network chip further processes and/or packages the flow datagram, and then sends it to one or more collectors of sampled flow data (e.g., sFlow collectors). In one or more embodiments of the invention, the performance of activities related to network traffic sampling by the sampling engine and network chip reduces the workload on various other components of the network device, such as one or more network device processors, which may improve overall network device performance.

In one or more embodiments of the invention, one or more network chips and one or more sampling engines of a network device are included in the same line card of the network device, while other line cards of the network device may not include any sampling engines. In such embodiments of the invention, network traffic may be sampled for network chips on line cards without sampling engines by transmitting network traffic samples to sampling engines on line cards of the network device that include such sampling engines. In one or more embodiments of the invention, some or all of the functionality described herein as being performed by a sampling engine may additionally or alternatively be performed by one or more network chips.

FIG. 1 shows a system in accordance with one or more embodiments of the invention. As shown in FIG. 1, the system includes a network device (100). The network device (100) may include an accelerated sampling line card (102) and a non-accelerated sampling line card (104). In one or more embodiments of the invention, an accelerated sampling line card (102) includes one or more network chips (e.g., network chip A (106), network chip B (108)) and at least one sampling engine (114). In one or more embodiments of the invention, a non-accelerated sampling line card (104) includes one or more network chips (e.g., network chip C (110), network chip D (112)), but no sampling engines. Each of these components is described below.

In one or more embodiments of the invention, a network device (100) may be a physical device that includes and/or may be operatively connected to persistent storage (not shown), memory (e.g., random access memory (RAM)) (not shown), one or more processors (e.g., integrated circuits) (not shown), and two or more physical network interfaces or ports (not shown). In one or more embodiments of the invention, the one or more processors of a network device (e.g., a central processing unit) are separate components from a network chip, one or more of which may also be components of a network device, and are discussed further below. As used herein, the term operatively connected, or operative connection, means that there exists between elements/components a direct or indirect connection that allows the elements to interact with one another in some way. For example, such elements may exchange information, send instructions to perform actions, cause changes in state and/or operating condition, etc.

In one or more embodiments of the invention, a network device (100) includes functionality to receive network traffic data units (e.g., frames, packets, etc.) at any of the physical network interfaces of the network device and to process the network traffic data units to determine whether to: (i) drop the network traffic data unit; (ii) process the network traffic data unit in accordance with one or more embodiments of the invention; and/or (iii) transmit the network traffic data unit, based on the processing, from another physical network interface or port on the network device (100) in accordance with one or more embodiments of the invention.

In one or more embodiments of the invention, the network device (100) also includes software and/or firmware stored in any network device storage (not shown) and/or network device memory (not shown) (i.e., non-transitory computer readable mediums). Such software may include instructions which, when executed by the one or more processors (not shown) of the network device, cause the one or more processors to perform operations in accordance with one or more embodiments of the invention. The software instructions may be in the form of computer readable program code to perform embodiments of the invention, and may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that, when executed by a processor(s), is configured to perform functionality related to embodiments of the invention. The functionality of a network device (100) is not limited to the aforementioned examples.

Examples of a network device (100) include, but are not limited to, a network switch, a router, a multilayer switch, a fibre channel device, an InfiniBand® device, etc. A network device (100) is not limited to the aforementioned specific examples.

In one or more embodiments of the invention, the network device (100) also includes any number of network chips (e.g., network chip A (106), network chip B (108), network chip C (110), network chip D (112)). In one or more embodiments of the invention, a network chip (106, 108, 110, 112) is any hardware (e.g., circuitry), software, firmware, and/or combination thereof that includes functionality to receive, process, and/or transmit network traffic data units in accordance with one or more embodiments of the invention. In order to perform such functionality, a network chip (106, 108, 110, 112) may include any number of components. Such components may include, but are not limited to, one or more processors, one or more buffers (e.g., for implementing receive and/or transmit queues, such as virtual output queues (VOQs)), any type or amount of non-volatile storage, and/or any type or amount of volatile storage (e.g., RAM). A network chip (106, 108, 110, 112) may also include and/or be operatively connected to any number of physical network interfaces (not shown) (e.g., transceivers) of the network device (100). Such interfaces may provide a path external to the network device (100) (e.g., to other devices), or may be operatively connected to other components internal to the network device (100), and each such interface may be an ingress and/or egress interface. In one or more embodiments of the invention, a network chip (106, 108, 110, 112) may be or include one or more application specific integrated circuits (ASICs).

As a non-limiting example, a network chip may be hardware that receives network traffic data units at an ingress port, and determines out of which egress port on the network device (100) to forward the network traffic data units such as, for example, media access control (MAC) frames that may include Internet Protocol (IP) packets. Network chips are discussed further in the description of FIG. 2, below.

In one or more embodiments of the invention, one or more network chips (106, 108, 110, 112) may be included as part of a line card (e.g., accelerated line card (102) or a non-accelerated line card (104)). In one or more embodiments of the invention, a line card is a collection of hardware (e.g., circuitry) that includes functionality to provide operative connectivity between various network chips (106, 108, 110, 112) and other components (e.g., physical interface ports, processors, storage, memory, sampling engines, etc. and software components (see FIG. 3 description below)) of a network device. A network device may include any number of line cards without departing from the scope of the invention.

In one or more embodiments of the invention, an accelerated line card (102) is a line card that includes one or more sampling engines (114). In one or more embodiments of the invention, a sampling engine (114) is a collection of hardware (e.g., circuitry), software, firmware, and/or any combination thereof configured to perform at least a portion of the functionality described herein (e.g., an ASIC). For example, a sampling engine (114) may be a field programmable gate array (FPGA), which includes various circuitry components and storage (e.g., static random access memory (SRAM), flash memory, etc.) for storing computational logic and performing various operations based, at least in part, on such stored logic. A sampling engine (114) may be operatively connected to any number of network chips (e.g., network chips (106, 108) located on the same accelerated line card as the sampling engine, or indirectly connected network chips (110, 112) included in non-accelerated line cards (104)). In one or more embodiments of the invention, a sampling engine (114) includes functionality to receive sampled network traffic data units, to process the sampled network traffic data units to obtain flow samples, to aggregate such flow samples, and to propagate the aggregated flow samples and related information as flow datagrams towards one or more collectors (e.g., via one or more network chips and one or more physical network interfaces of the network device). Sampling engines are discussed further in the description of FIG. 2, below.

In one or more embodiments of the invention, a non-accelerated line card (104) is substantially similar to an accelerated line card (102), but without including any sampling engine(s). A network device (100) may have zero or more non-accelerated line cards without departing from the scope of the invention. In one or more embodiments of the invention that include at least one non-accelerated line card (104), each non-accelerated line card is operatively connected to at least one accelerated line card (102) of the network device (100).

While FIG. 1 shows a configuration of components, other configurations may be used without departing from the scope of the invention. For example, various components may be combined to create a single component. As another example, the functionality performed by a single component may be performed by two or more components. Accordingly, embodiments disclosed herein should not be limited to the configuration of components shown in FIG. 1.

FIG. 2 shows a network chip (200) coupled to a sampling engine (214) in accordance with one or more embodiments of the invention. In one or more embodiments of the invention, the network chip (200) and sampling engine (214) are each included in an accelerated line card (e.g., 102 from FIG. 1) of a network device (e.g., 100 from FIG. 1). Additionally or alternatively, the network chip (200) may be included in a non-accelerated line card and operatively connected to another network chip that is included in an accelerated line card that also includes the sampling engine (214). In one or more embodiments of the invention, a network chip (200) includes an external interface (204), a traffic receiver (206), a traffic processor (208), an internal interface (210), a traffic sampler (212), and a collector interface device (220). In one or more embodiments of the invention, a sampling engine (214) includes a datagram manager (216) and storage (218). Each of these components is described below.

In one or more embodiments of the invention, a network chip (200) is substantially similar to the network chips (106, 108, 110, 112) discussed above in the description of FIG. 1, and the components of the network chip shown in FIG. 2 are implemented using all or any portion of the hardware, software, firmware, etc. described above as being included in a network chip.

In one or more embodiments of the invention, the network chip includes an external interface (204). In one or more embodiments of the invention, an external interface is hardware (e.g., circuitry) and/or software that is operatively connected to one or more physical network interfaces (e.g., optical transceivers), which provide an interface to other devices on a network (e.g., computing devices (not shown), other network devices (not shown), etc.). In one or more embodiments of the invention, an external interface (204) includes functionality to receive network traffic data units and/or transmit network traffic data units that have been processed by one or more network chips of a network device.

In one or more embodiments of the invention, the network chip (200) includes a traffic receiver (206). In one or more embodiments of the invention, a traffic receiver (206) is any hardware (e.g., circuitry), software, firmware, or any combination thereof, that includes functionality to receive network traffic data units from an operatively connected external interface (204) of a network chip (200), and to determine whether to capture (e.g., mirror/copy) a given network traffic data unit in order to sample the network traffic data unit. In one or more embodiments of the invention, a traffic receiver (206) includes functionality to capture network traffic data units based on a traffic sampling rate, which may be pre-configured for a network chip (200) and/or may be configurable (e.g., by a network device user, by software executing on the network device, etc.). As an example, the traffic receiver (206) may include functionality to count the number of received network traffic data units, and capture one out of every thousand network traffic data units in order to sample the network traffic flow. In one or more embodiments of the invention, the traffic receiver is also operatively connected to a traffic processor (208) (discussed below) and includes functionality to propagate received network traffic data units to the traffic processor (208) for processing.
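As a non-limiting illustration of the count-based capture described above, the following Python sketch models a traffic receiver in software. The class name, method name, and default rate are hypothetical and are not part of the disclosure; an actual traffic receiver (206) may be implemented in circuitry and may select data units differently (e.g., pseudo-randomly).

```python
class CountingTrafficReceiver:
    """Hypothetical software model of a count-based traffic receiver."""

    def __init__(self, sampling_rate: int = 1000) -> None:
        # sampling_rate = N means "capture one out of every N data units".
        if sampling_rate < 1:
            raise ValueError("sampling_rate must be at least 1")
        self.sampling_rate = sampling_rate
        self.received = 0

    def receive(self, data_unit: bytes) -> bool:
        """Count the data unit; return True if it should be captured (mirrored)."""
        self.received += 1
        return self.received % self.sampling_rate == 0


# Feed 5000 data units through the receiver and record which ones are captured.
receiver = CountingTrafficReceiver(sampling_rate=1000)
captured = [i for i in range(1, 5001) if receiver.receive(b"frame")]
```

In this sketch every data unit is still counted and would still be propagated for normal processing; only the capture decision depends on the traffic sampling rate.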

In one or more embodiments of the invention, the network chip (200) includes a traffic sampler (212). In one or more embodiments of the invention, a traffic sampler (212) is any hardware (e.g., circuitry), software, firmware, or any combination thereof that includes functionality to receive network traffic data units that have been captured for sampling by an operatively connected traffic receiver (206). In one or more embodiments of the invention, the traffic sampler (212) includes functionality to perform processing and/or obtain from other network chip components (e.g., traffic processor (208)) various items of information related to a captured network traffic data unit. Such information may include, but is not limited to, the ingress interface of the network traffic data unit, the egress interface of the network traffic data unit, an ingress or egress virtual local area network (VLAN) associated with the network traffic data unit, a next hop device for the network traffic data unit, a reverse path determined using the source of the network traffic data unit, a network tunneling protocol network segment associated with the network traffic data unit, etc.

A traffic sampler (212) may include functionality to include any or all information relating to the network traffic data unit as part of a header (i.e., a sampling header), and to prepend or append such a header to the network traffic data unit. In one or more embodiments of the invention, a network traffic data unit with such a header prepended is referred to as a sampled network traffic data unit (see description of FIG. 4A, below). Other headers (e.g., an Ethernet header) may also be prepended or appended to the sampled network traffic data unit by the traffic sampler (212) without departing from the scope of the invention.
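As a non-limiting illustration of prepending a sampling header, the following Python sketch packs three of the items of information mentioned above into a fixed-size header and prepends it to a captured data unit. The field selection, field widths, byte order, and all names are assumptions made for this example only; the disclosure does not specify a sampling header layout.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Hypothetical layout: ingress interface (4 bytes), egress interface (4 bytes),
# VLAN (2 bytes), network byte order, no padding. Total size: 10 bytes.
SAMPLING_HEADER = struct.Struct("!IIH")


@dataclass(frozen=True)
class SamplingInfo:
    ingress_interface: int
    egress_interface: int
    vlan: int


def build_sampled_unit(info: SamplingInfo, data_unit: bytes) -> bytes:
    """Prepend the sampling header to the captured data unit."""
    header = SAMPLING_HEADER.pack(
        info.ingress_interface, info.egress_interface, info.vlan
    )
    return header + data_unit


def parse_sampled_unit(sampled: bytes) -> Tuple[SamplingInfo, bytes]:
    """Split a sampled data unit back into its header fields and payload."""
    fields = SAMPLING_HEADER.unpack_from(sampled)
    return SamplingInfo(*fields), sampled[SAMPLING_HEADER.size:]
```

The parsing function stands in for the receiving side, which would need to know the same layout in order to recover the sample information.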

In one or more embodiments of the invention, the traffic sampler (212) is operatively connected to a dedicated interface (not shown) that provides an operative connection to a sampling engine (214). In one or more embodiments of the invention, the traffic sampler (212) includes functionality to transmit a sampled network traffic data unit to the sampling engine (214) using such a dedicated interface, which may be an otherwise unused interface of the network chip (200).

In one or more embodiments of the invention, a traffic processor (208) is any hardware (e.g., circuitry), software, firmware, or any combination thereof, that includes functionality to process received network traffic data units. As discussed above in the description of FIG. 1 with respect to network chips, a traffic processor (208) may include functionality to receive network traffic data units, and to process the network traffic data units to determine what action should be taken (e.g., extract information, determine where to send the network traffic data unit, drop the network traffic data unit, etc.) in response to receipt of the network traffic data units. In one or more embodiments of the invention, the traffic processor (208) is operatively connected to an external interface (204), a traffic receiver (206), an internal interface (210), and/or a collector interface device (220).

In one or more embodiments of the invention, the network chip (200) includes an internal interface (210). In one or more embodiments of the invention, an internal interface (210) is any hardware (e.g., circuitry), software, firmware, or any combination thereof, that is operatively connected to other components internal to a network device. In one or more embodiments of the invention, an internal interface (210) includes functionality to receive network traffic data units (e.g., from a traffic processor (208)) and transmit such network traffic data units to other components of a network device (e.g., other network chips). As an example, a traffic processor (208) may receive a network traffic data unit, process the network traffic data unit to determine that it should be sent from a physical interface operatively connected to a network chip of another line card of a network device, and, based on the processing, send the network traffic data unit to the appropriate internal interface (210) that provides operative connectivity to the other network chip.

In one or more embodiments of the invention, the network chip (200) includes a collector interface device (220). In one or more embodiments of the invention, a collector interface device (220) is any hardware (e.g., circuitry), software, firmware, or any combination thereof, that includes functionality to receive flow datagrams (see description of FIG. 4C, below) from an operatively connected sampling engine (214). In one or more embodiments of the invention, the collector interface device (220) is operatively connected to the sampling engine (214) via a dedicated interface (not shown) of the network chip (200), which may or may not be the same dedicated interface that operatively connects the sampling engine to the traffic sampler (212) (discussed above). In one or more embodiments of the invention, the collector interface device (220) includes functionality to provide flow network traffic data units (see FIG. 4D, below) to the traffic processor (208) to be propagated towards one or more collector devices (e.g., via an external interface (204) or internal interface (210) of the network chip (200)).

In one or more embodiments of the invention, a sampling engine (214) is substantially similar to the sampling engine (114) discussed above in the description of FIG. 1, and the components of the sampling engine (214) shown in FIG. 2 are implemented using all or any portion of the hardware, software, firmware, storage, etc. described above as being included in a sampling engine.

In one or more embodiments of the invention, the sampling engine (214) includes and/or is operatively connected to storage (218). In one or more embodiments of the invention, the storage (218) is any form of data storage (e.g., SRAM, flash memory, etc.). In one or more embodiments of the invention, the storage (218) includes functionality to store flow samples. In one or more embodiments of the invention, a flow sample is at least a portion of a network traffic data unit, with a header prepended including additional information related to the network traffic data unit (i.e., sample information). Flow samples are discussed further in the description of FIG. 4B, below.

In one or more embodiments of the invention, the sampling engine (214) includes a datagram manager (216). In one or more embodiments of the invention, a datagram manager (216) is any hardware (e.g., circuitry), software, firmware, stored logic, etc. that includes functionality to receive sampled network traffic data units, to process the received network traffic data units to obtain flow samples, to store the flow samples in operatively connected storage (218), to construct flow datagrams using the flow samples, and to send the flow datagrams to a network chip to be propagated towards one or more collectors.
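As a non-limiting illustration of the datagram manager functionality described above, the following Python sketch truncates each data unit, stores the resulting flow sample, and, once enough flow samples have accumulated, returns them together and clears the storage. The class and field names, the truncation length, and the aggregation threshold are assumptions for this example only. For brevity, the sample information is passed in already parsed and the flow datagram header is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FlowSample:
    header: Dict[str, int]  # flow sample header carrying the sample information
    portion: bytes          # truncated network traffic data unit


@dataclass
class DatagramManager:
    max_portion_bytes: int = 128   # assumed truncation length
    samples_per_datagram: int = 4  # assumed aggregation threshold
    storage: List[FlowSample] = field(default_factory=list)

    def handle_sampled_unit(
        self, sample_info: Dict[str, int], data_unit: bytes
    ) -> Optional[List[FlowSample]]:
        """Store one flow sample; return a datagram's samples when enough exist."""
        portion = data_unit[: self.max_portion_bytes]
        self.storage.append(FlowSample(header=dict(sample_info), portion=portion))
        if len(self.storage) < self.samples_per_datagram:
            return None
        # Copy the stored samples into the outgoing datagram, then clear storage.
        datagram = list(self.storage)
        self.storage.clear()
        return datagram
```

A caller would forward any non-None return value to the network chip; a time-based flush, which a practical implementation might also want, is not modeled here.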

While FIG. 2 shows a configuration of components, other configurations may be used without departing from the scope of the invention. For example, various components may be combined to create a single component. As another example, the functionality performed by a single component may be performed by two or more components. Also, although FIG. 1 and FIG. 2 show a sampling engine separate from network chips, one of ordinary skill in the art and having the benefit of this Detailed Description will appreciate that some or all of the functionality of the sampling engine may alternatively be performed by a network chip. Accordingly, embodiments disclosed herein should not be limited to the configuration of components shown in FIG. 2.

FIG. 3 shows a software architecture (300) in accordance with one or more embodiments of the invention. As shown in FIG. 3, the software architecture (300) includes a sampling acceleration manager (302), a system database (304), a sampling engine manager (314), a network chip manager (306), any number of network chips (e.g., network chip A (308), network chip B (310)), and any number of sampling engines (e.g., sampling engine (312)). Each of these components is described below. In one or more embodiments of the invention, the software architecture (300) shown in FIG. 3 executes on hardware of a network device (e.g., one or more processors), and functions to manage the hardware and/or software components (e.g., network chips, sampling engines, software agents, etc.) of the accelerated network traffic sampling described herein.

In one or more embodiments of the invention, the network chips (308, 310) and the sampling engine (312) shown in FIG. 3 are substantially similar to the network chips and sampling engines discussed above in the descriptions of FIG. 1 and FIG. 2.

In one or more embodiments of the invention, the software architecture (300) includes a sampling acceleration manager (302). In one or more embodiments of the invention, the sampling acceleration manager (302) includes software instructions stored at least temporarily on the storage or memory of a network device (e.g., network device (100) of FIG. 1) that are executed by one or more processors of the network device to perform various operations. Such operations may be related to management of network traffic sampling acceleration from a network device point of view (e.g., rather than a single sampling engine, line card, or network chip). Such operations may include, but are not limited to, tracking insertions and removals of accelerated line cards and non-accelerated line cards, load balancing network chips among available sampling engines of accelerated line cards, maintaining data related to acceleration-related capabilities within the system in a system database (304) (e.g., line card X supports acceleration, line card Y does not, etc.), and managing and/or initiating execution of a software agent (not shown) that performs functionality relating to network traffic sampling (e.g., monitoring counters). As a non-limiting example, a network device processor (not shown) may include functionality to execute a software agent configured to gather information from any number of counters of the network device, and to initiate transmission of such information towards an entity configured to receive such information, such as a collector executing on a different physical or virtual device that collects information related to network traffic flow (e.g., an sFlow collector). In such an example, the counters may count any type of recurring event, such as a quantity of network traffic data units processed by all sampling engines of a network device in a given time period, a total number of flow datagrams generated by sampling engines of the network device in a given time period, etc. In one or more embodiments of the invention, the sampling acceleration manager manages, configures, and/or coordinates the actions, at least in part, of such a software agent.

In one or more embodiments of the invention, the software architecture (300) includes a system database (304). In one or more embodiments of the invention, a system database (304), as used herein, is any type of storage unit and/or device (e.g., a file system, database, collection of tables, or any other storage mechanism) for storing data. Further, the system database (304) may include multiple different storage units and/or devices. In one or more embodiments of the invention, the system database stores information related to accelerating network traffic sampling, configuring network chips, and/or configuring sampling engines, etc.

In one or more embodiments of the invention, the software architecture includes any number of network chip managers (306). In one or more embodiments of the invention, a network chip manager (306) includes software instructions stored at least temporarily on the storage or memory of a network device that are executed by one or more processors of the network device to perform various operations related to configuring and managing network chips. Such operations may include, but are not limited to, configuring the coupling between a given network chip and a sampling engine, configuring the network chip to sample network traffic data units at a certain sampling rate, configuring a network chip to prepend or append a sampling header to a sampled network traffic data unit before transmitting the sampled network traffic data unit towards a sampling engine either directly or via another network chip, and/or configuring interfaces, VOQs, etc. The software architecture (300) may include any number of network chip managers, with each managing any number of network chips, without departing from the scope of the invention.

In one or more embodiments of the invention, the software architecture (300) includes a sampling engine manager (314). In one or more embodiments of the invention, a sampling engine manager (314) includes software instructions stored at least temporarily on the storage or memory of a network device that are executed by one or more processors of the network device to perform various operations related to configuring and managing/controlling sampling engines. For example, in embodiments of the invention that include an SRAM-based FPGA as a sampling engine, a sampling engine manager is responsible for configuring the logic used by the FPGA to perform the functionality of the sampling engine as described herein, at least in part, by implementing the logic in the SRAM of the FPGA. The software architecture (300) may include any number of sampling engine managers, with each managing any number of sampling engines, without departing from the scope of the invention.

While FIG. 3 shows a configuration of components of a software architecture, other configurations may be used without departing from the scope of the invention. For example, various components may be combined to create a single component. As another example, the functionality performed by a single component may be performed by two or more components. Accordingly, embodiments disclosed herein should not be limited to the configuration of components shown in FIG. 3.

FIG. 4A shows an exemplary sampled network traffic data unit structure in accordance with one or more embodiments of the invention. The following example is for explanatory purposes only and not intended to limit the scope of the invention.

As shown in FIG. 4A, the sampled network traffic data unit (400) includes a network traffic data unit (408), a frame check sequence (FCS) (410), and a sampling header (402). The sampling header may include a network traffic data unit information header (406) and an Ethernet header (404). Each of these components is described below.

In one or more embodiments of the invention, the sampled network traffic data unit (400) is generated by a network chip. In one or more embodiments of the invention, the sampled network traffic data unit (400) includes a network traffic data unit (408) that was captured by the network chip according to a sampling rate of the network chip. For example, a network chip may be configured to capture one out of every 1,000 media access control (MAC) frames received by a network device at a given physical network interface. In one or more embodiments of the invention, the network traffic data unit (408) includes a payload (not shown) and any headers (e.g., MAC information, TCP/IP information, etc.) that have been prepended or appended to the payload before its receipt by the network chip.

In one or more embodiments of the invention, the sampled network traffic data unit (400) includes a sampling header (402), which includes at least a network traffic data unit information header (406), and may optionally also include an Ethernet header (404). In one or more embodiments of the invention, the network traffic data unit information header (406) is prepended or appended to a captured network traffic data unit (408) by a network chip. The network traffic data unit information header (406) may include information related to the network traffic data unit (408) including, but not limited to, original source information, original destination information, ingress interface, egress interface (or multiple egress interfaces if the network traffic data unit is to be multicast from the network device), ingress VLAN, egress VLAN, tunneling protocol network segment (ingress and/or egress), reverse path information, a sampling rate, and/or any other data relevant to the network traffic data unit (408) and/or to a flow sampling system (e.g., sFlow). In one or more embodiments of the invention, an Ethernet header (404), if included in the sampling header, includes Ethernet information (e.g., source MAC address, destination MAC address, etc.) that may or may not convey useful information to a sampling engine or be used, in part, in order to transmit the sampled network traffic data unit (400) towards the sampling engine.

The sampling header may also include information that identifies certain characteristics of a network traffic data unit that may be relevant to a collector implementing a network traffic analyzer standard. As a non-limiting example, if the network traffic analyzer standard being implemented is sFlow, the content of the sampling header relating to an egress interface may be varied. In such an example, a regular unicast network traffic data unit may maintain the correct egress interface, a multicast network traffic data unit may have an unknown interface set, a network traffic data unit intended for a processor of the network device that includes the network chip adding the sampling header may have the egress interface set to identify the network device, and a network traffic data unit that is to be dropped (e.g., due to a rule within an access control list (ACL)) may have the egress interface set to indicate that the network traffic data unit is to be dropped.
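The egress interface handling in the preceding example can be sketched as a small selection function. This is a minimal illustration only: the `EgressMarker` names, the `egress_field` function, and the precedence of "dropped" over "destined for the device processor" are assumptions made for the sketch and are not taken from sFlow or from the embodiments described herein.

```python
from enum import Enum
from typing import Sequence, Union


class EgressMarker(Enum):
    """Hypothetical placeholder values for non-unicast cases."""
    UNKNOWN = "unknown"            # e.g., multicast data unit
    LOCAL_DEVICE = "local-device"  # destined for a processor of the device
    DROPPED = "dropped"            # e.g., denied by an ACL rule


def egress_field(egress_ifs: Sequence[int],
                 to_processor: bool = False,
                 dropped: bool = False) -> Union[int, EgressMarker]:
    """Choose the egress value to record in the sampling header."""
    if dropped:                    # assumed to take precedence
        return EgressMarker.DROPPED
    if to_processor:
        return EgressMarker.LOCAL_DEVICE
    if len(egress_ifs) == 1:       # regular unicast keeps its interface
        return egress_ifs[0]
    return EgressMarker.UNKNOWN    # multicast (or no known egress)
```

For instance, `egress_field([7])` keeps interface 7, while `egress_field([7, 9])` yields the unknown marker.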

In one or more embodiments of the invention, the sampled network traffic data unit (400) includes an FCS (410). In one or more embodiments of the invention, an FCS (410) is an error detecting code that is added to the sampled network traffic data unit, and may or may not be used (e.g., by a sampling engine) to determine if the sampled network traffic data unit (400) is damaged in some way.
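As a rough illustration of how an error detecting code of this kind may be used, the following sketch appends a 32-bit CRC to a byte string and later checks it. It assumes the CRC-32 computed by Python's `zlib.crc32` and a little-endian four-byte trailer; both choices, and the function names `append_fcs` and `fcs_is_valid`, are assumptions for the example rather than details of the embodiments.

```python
import struct
import zlib


def append_fcs(data: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer (byte order is an assumption)."""
    return data + struct.pack("<I", zlib.crc32(data))


def fcs_is_valid(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailer."""
    if len(frame) < 4:
        return False
    body, trailer = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == trailer


frame = append_fcs(b"sampling header + network traffic data unit")
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit
```

Here `fcs_is_valid(frame)` holds for the intact frame, and the single flipped bit in `damaged` causes the check to fail.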

FIG. 4B shows an exemplary flow sample structure in accordance with one or more embodiments of the invention. The following example is for explanatory purposes only and not intended to limit the scope of the invention.

As shown in FIG. 4B, the flow sample (420) includes a network traffic data unit portion (424) and a flow sample header (422). Each of these components is described below.

In one or more embodiments of the invention, a flow sample (420) is generated by a sampling engine using a sampled network traffic data unit (e.g., sampled network traffic data unit (400) of FIG. 4A) received from a network chip. In one or more embodiments of the invention, the flow sample (420) is generated by determining or obtaining various items of information related to the network traffic data unit (408) of the sampled network traffic data unit (400) (e.g., information from the sampling header (402), a size of the network traffic data unit, etc., which may collectively be referred to as sample information) and using such items of information to include in a flow sample header (422), which is prepended or appended to a network traffic data unit portion (424). In one or more embodiments of the invention, the network traffic data unit portion (424) is a truncated portion (e.g., 128 bytes) of the network traffic data unit (408) included in the received sampled network traffic data unit (400). As an example, a truncated portion may be the first 128 bytes of the network traffic data unit. In one or more embodiments of the invention, the truncated portion includes at least the one or more headers prepended to the payload of the network traffic data unit. In one or more embodiments of the invention, a flow sample (420) is stored in storage of a sampling engine at least until it is included in a flow datagram (described below).

FIG. 4C shows an exemplary flow datagram structure in accordance with one or more embodiments of the invention. The following example is for explanatory purposes only and not intended to limit the scope of the invention.

As shown in FIG. 4C, the flow datagram (440) includes a flow datagram header (442) and one or more flow samples (e.g., flow sample A (444), flow sample N (446)). Each of these components is described below.

In one or more embodiments of the invention, the flow samples (444, 446) included in the flow datagram (440) are a set of flow samples created by the sampling engine (such as the flow sample discussed above in the description of FIG. 4B), and are obtained from storage of the sampling engine. In one or more embodiments of the invention, the quantity of flow samples included in a given flow datagram (and, thus, the size of the flow datagram) is based, at least in part, on a maximum transmission unit (MTU) size associated with a path to one or more collectors.

In one or more embodiments of the invention, the flow datagram header (442) includes, but is not limited to, a sampling technology version number (e.g., sFlow version number), IP version (e.g., IPv4, IPv6), an IP address of a software agent executing on the network device, a flow datagram sequence number, an uptime of the network device, a number of flow samples included in the flow datagram (440), and/or any other information related to the flow samples, the flow datagram, the network device, the sampling engine, etc. In one or more embodiments of the invention, the flow datagram header (442) also includes at least a portion of a user datagram protocol (UDP) header, such as, for example, a UDP port number for directing the flow datagram to a collector executing on a destination computing device. The flow datagram header (442) may include additional and/or different information without departing from the invention.

FIG. 4D shows an exemplary flow network traffic data unit structure in accordance with one or more embodiments of the invention. The following example is for explanatory purposes only and not intended to limit the scope of the invention.

As shown in FIG. 4D, the flow network traffic data unit (460) includes an Ethernet header (462), an IP header (464), a UDP header (466), and a flow datagram (468).

In one or more embodiments of the invention, the flow datagram (468) included in the flow network traffic data unit (460) is substantially similar to the flow datagram (440) discussed above in the description of FIG. 4C.

In one or more embodiments of the invention, each of the Ethernet header (462), the IP header (464), and the UDP header (466) is used, at least in part, to propagate the flow datagram towards one or more collectors (e.g., sFlow collectors). In one or more embodiments of the invention, the Ethernet header (462), the IP header (464), and the UDP header (466) are added by the network chip after receipt of a flow datagram from a sampling engine, by a sampling engine before sending the flow datagram to a network chip, or by a combination of the sampling engine and the network chip. For example, the sampling engine may add a UDP header intended to get the flow datagram to the collector application executing on a computing device, and the Ethernet and IP headers may be added by the network chip in order to propagate the flow datagram and UDP header through a network to the computing device on which the collector executes. In one or more embodiments of the invention, the Ethernet header includes at least a source and a destination MAC address, and the IP header includes at least a source and a destination IP address. In one or more embodiments of the invention, the UDP header includes at least a UDP port number. Although FIG. 4D shows a flow network traffic data unit (460) as including an Ethernet header, an IP header, and a UDP header, one of ordinary skill in the art and having the benefit of this Detailed Description will appreciate that headers of other network protocols may additionally or alternatively be prepended or appended to a flow datagram in order to propagate the flow datagram towards one or more collectors.
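To make the UDP portion of this encapsulation concrete, the following sketch prepends a standard 8-byte UDP header (source port, destination port, length, checksum) to a flow datagram. The function name `add_udp_header` and the port values are illustrative assumptions; the checksum is left at zero, which UDP over IPv4 permits but UDP over IPv6 does not, and the Ethernet and IP headers are omitted for brevity.

```python
import struct

# Standard UDP header: four 16-bit fields in network byte order.
UDP_HEADER = struct.Struct("!HHHH")


def add_udp_header(flow_datagram: bytes, src_port: int, dst_port: int) -> bytes:
    """Prepend a UDP header; checksum 0 means 'not computed' (IPv4 only)."""
    length = UDP_HEADER.size + len(flow_datagram)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + flow_datagram


# Example values only: 6343 is the port commonly used by sFlow collectors.
segment = add_udp_header(b"flow datagram bytes", src_port=50000, dst_port=6343)
```

In an embodiment where the sampling engine adds the UDP header, the network chip would then only need to add the IP and Ethernet headers in front of `segment`.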

FIG. 5, FIG. 6, FIG. 7, and FIG. 8 show flowcharts in accordance with one or more embodiments of the invention. While the various steps in the flowcharts are presented and described sequentially, one of ordinary skill in the art and having the benefit of this Detailed Description will appreciate that some or all of the steps may be executed in different orders, may be combined or omitted, and some or all of the steps may be executed in parallel. Furthermore, the steps may be performed actively or passively. For example, some steps may be performed using polling or be interrupt driven in accordance with one or more embodiments of the invention. By way of an example, determination steps may not require a processor or other device component to process an instruction unless a condition exists (e.g., a sampling rate dictates a network traffic data unit should be sampled) in accordance with one or more embodiments of the invention.

FIG. 5 shows a flowchart describing a method for accelerating network traffic sampling in accordance with one or more embodiments of the invention.

In Step 500, a network traffic data unit is received at a network chip of a network device. For example, a network data traffic unit, such as a MAC frame and/or an IP packet, is transmitted from some other device, such as a computing device or another network device, towards a destination. In such scenarios, a network device that includes a network chip may receive the network data traffic unit in order to process the network data traffic unit and, if appropriate, propagate the network data traffic unit towards the destination. In one or more embodiments of the invention, a network data traffic unit is received at a physical network interface of the network device, and then propagated to the network chip.

In Step 502, the network chip that received the network data traffic unit in Step 500 selects/captures the network data traffic unit based on a network traffic sampling rate. In one or more embodiments of the invention, selecting/capturing a network data traffic unit includes mirroring the network data traffic unit, which may include generating a copy of the network traffic data unit and then storing the generated copy. In one or more embodiments of the invention, the sampling rate is any rate that is pre-configured and/or configurable for the network chip and determines the rate of sampling from received network data traffic units. For example, the sampling rate may dictate that the network chip capture one of every 2,000 network data traffic units that are received by the network chip. The sampling rate may apply to the network chip as a whole, to an external interface of a network chip, to a physical network interface operatively connected to the network chip, etc.
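The one-of-every-N selection described in Step 502 can be modeled with a simple counter. The sketch below is a software illustration only; the class name `CountingSampler` is hypothetical, and a network chip may instead select data units pseudo-randomly so that the average rate is one in N.

```python
from dataclasses import dataclass


@dataclass
class CountingSampler:
    """Captures one data unit out of every `rate` received (illustrative model)."""
    rate: int      # e.g., 2000 means one of every 2,000 data units
    seen: int = 0  # data units received since the last capture

    def should_capture(self) -> bool:
        """Return True for the data unit that completes each group of `rate`."""
        self.seen += 1
        if self.seen >= self.rate:
            self.seen = 0
            return True
        return False


sampler = CountingSampler(rate=2000)
captured = sum(sampler.should_capture() for _ in range(10_000))
print(captured)  # 5
```

With a rate of 2,000, the 10,000 simulated data units yield five captures, one at the end of each group of 2,000.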

In Step 504, a sampling header is added to the network data traffic unit by the network chip. In one or more embodiments of the invention, the sampling header includes any information related to the network traffic data unit, and may be added to the network data traffic unit by prepending or appending the sampling header to the network data traffic unit, in order to obtain a sampled network traffic data unit.
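As a sketch of Step 504, the following code prepends a fixed-size information header to a captured data unit and shows how a receiver could split the two apart again. The field selection (ingress interface, egress interface, ingress VLAN, egress VLAN, sampling rate), the 16-byte layout, and the function names are assumptions chosen for illustration; the embodiments do not prescribe a particular layout.

```python
import struct
from typing import Tuple

# Hypothetical layout, network byte order:
# ingress_if (u32), egress_if (u32), ingress_vlan (u16), egress_vlan (u16), rate (u32)
INFO_HEADER = struct.Struct("!IIHHI")


def add_sampling_header(data_unit: bytes, ingress_if: int, egress_if: int,
                        ingress_vlan: int, egress_vlan: int, rate: int) -> bytes:
    """Prepend the information header to obtain a sampled data unit."""
    header = INFO_HEADER.pack(ingress_if, egress_if, ingress_vlan, egress_vlan, rate)
    return header + data_unit


def split_sampled_unit(sampled: bytes) -> Tuple[tuple, bytes]:
    """Return (header fields, original data unit) from a sampled data unit."""
    fields = INFO_HEADER.unpack_from(sampled)
    return fields, sampled[INFO_HEADER.size:]


sampled = add_sampling_header(b"\x00" * 64, ingress_if=3, egress_if=7,
                              ingress_vlan=100, egress_vlan=200, rate=2000)
```

Calling `split_sampled_unit(sampled)` recovers the five fields and the unchanged 64-byte data unit.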

In Step 506, the sampled network traffic data unit is transmitted to a sampling engine. In one or more embodiments of the invention, the network chip includes a sampling engine, and thus transmitting the sampled network data traffic unit to the sampling engine includes propagating the sampled network data traffic unit to the portion of the network chip implementing the functionality of the sampling engine. In other embodiments of the invention, the sampling engine is separate from the network chip, and an operative connection exists between the sampling engine and the network chip that is configured for the transmission of the sampled network data traffic unit from the network chip to the sampling engine. For example, one of the internal interfaces of the network chip may be a dedicated connection to the sampling engine, and the sampled network data traffic unit is transmitted over that dedicated connection. In one or more embodiments of the invention, a dedicated connection is a connection that is only used for transmitting data between two elements, such as a network chip and a sampling engine.

As another example, the network traffic data unit may be received by a network chip of a non-accelerated line card, and such a network chip may propagate the sampled network traffic data unit to a network chip of an accelerated line card connected to a sampling engine via a dedicated connection, with such propagation using a configured VOQ of the network chip of the non-accelerated line card.

As another example, a network chip of a non-accelerated line card may prepend or append information to a captured network traffic data unit and send, via a configured VOQ, the network traffic data unit to a network chip of an accelerated line card which, in turn, creates a sampled network traffic data unit with a sampling header based on the information prepended or appended by the network chip of the non-accelerated line card. In such an example, the network chip of the accelerated line card then transmits the sampled network traffic data unit to a sampling engine via a dedicated interface.

In Step 508, a flow datagram is received from a sampling engine. In one or more embodiments of the invention, the flow datagram includes a quantity of one or more flow samples, each generated based on a sampled network data traffic unit processed by the sampling engine. In one or more embodiments of the invention, the network chip further processes the flow datagram in order to prepare the flow datagram for transmission to one or more collectors. For example, the flow datagram may include information, such as a UDP header, that allows the network chip to determine that the flow datagram is to be sent to two sFlow collectors, each operatively connected to a separate physical network interface of the network device. In such an example, the network chip may determine Ethernet and/or IP information related to the destination collectors, make a copy of the flow datagram (so one exists for each collector), and prepend or append the respective collector-related information to the flow datagram and to the copy to obtain two flow network data traffic units, one for each identified collector.

In Step 510, the flow network data traffic unit is sent towards one or more collectors. In one or more embodiments of the invention, the various headers prepended or appended to the flow datagram in Step 508 are used to determine where to transmit the flow network data traffic unit in order to propagate the flow network data traffic unit towards the one or more collectors. Continuing the example from Step 508, one flow network data traffic unit in the example may be propagated towards a first destination collector directly from an external interface of the network chip, and the other flow network data traffic unit in the example may be propagated towards a second destination collector by transmitting the flow network traffic data unit, via an internal interface of the network chip, through the internal fabric of the network device to another network chip which, in turn, propagates the second flow datagram towards the second collector via an external interface of the second network chip.

FIG. 6 shows a flowchart describing a method for accelerating network traffic sampling in accordance with one or more embodiments of the invention.

In Step 600, a sampled network traffic data unit is received at a sampling engine. In one or more embodiments of the invention, the sampled network traffic data unit is received from a network chip, and includes a network traffic data unit and a sampling header.

In Step 602, the sampled network traffic data unit is processed by the sampling engine to obtain the network traffic data unit and information related to the network traffic data unit. In one or more embodiments of the invention, the network traffic data unit obtained is used to determine the size of the network traffic data unit. Information included in the sampling header (see description of FIG. 4A, above) of the received sampled network traffic data unit, the size of the network traffic data unit, and any other information related to the network traffic data unit may be referred to as sample information.

In Step 604, the network traffic data unit obtained in Step 602 is truncated to obtain a network traffic data unit portion. In one or more embodiments of the invention, truncating the network traffic data unit includes shortening the network traffic data unit by removing some of the network traffic data unit. For example, truncation of a network traffic data unit may include removing all but the first 256 bytes of the network traffic data unit. In one or more embodiments of the invention, truncation of the network traffic data unit occurs so that each flow sample including a truncated network traffic data unit is smaller (e.g., fewer bytes than the entire network traffic data unit), which may allow for more flow samples to be included in a given flow datagram.

In Step 606, a flow sample header is generated using the sample information. In one or more embodiments of the invention, the flow sample header is prepended or appended to the network traffic data unit portion to obtain a flow sample.
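Steps 602 through 606 can be sketched together as a single function that records the original size, truncates the data unit, and prepends a flow sample header. The header layout (ingress interface, egress interface, sampling rate, original length), the default of 128 bytes, and the name `make_flow_sample` are assumptions for this illustration, not a format defined by the embodiments or by sFlow.

```python
import struct

# Hypothetical flow sample header: ingress_if, egress_if, sampling rate,
# and original data unit length, each an unsigned 32-bit integer.
FLOW_SAMPLE_HEADER = struct.Struct("!IIII")


def make_flow_sample(data_unit: bytes, ingress_if: int, egress_if: int,
                     rate: int, max_bytes: int = 128) -> bytes:
    """Truncate the data unit and prepend a header built from sample information."""
    original_length = len(data_unit)   # Step 602: size of the data unit
    portion = data_unit[:max_bytes]    # Step 604: keep only the first bytes
    header = FLOW_SAMPLE_HEADER.pack(  # Step 606: header from sample information
        ingress_if, egress_if, rate, original_length)
    return header + portion


flow_sample = make_flow_sample(bytes(1500), ingress_if=3, egress_if=7, rate=2000)
```

A 1,500-byte data unit thus becomes a 144-byte flow sample (16-byte header plus 128-byte portion), while the header still records the original length.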

In Step 608, the flow sample generated in Step 606 is stored in storage of the sampling engine, which may be storage included in and/or operatively connected to the sampling engine.

In Step 610, a flow datagram is constructed using the flow sample and a plurality of other flow samples. In one or more embodiments of the invention, the sampling engine stores any number of flow samples before aggregating the flow samples into a flow datagram. In one or more embodiments of the invention, the number of flow samples to be included in a given flow datagram is determined based on a maximum transmission unit (MTU) size available on a path to one or more collectors. In one or more embodiments of the invention, the MTU size is known to the sampling engine via configuration of the sampling engine by a sampling engine manager. In one or more embodiments of the invention, the construction of the flow datagram also includes prepending or appending to the flow samples a flow datagram header (see description of FIG. 4C, above). In one or more embodiments of the invention, the flow datagram header includes information related to the flow samples, as well as information identifying one or more collectors to which the flow datagram is to be transmitted.
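One way to derive the number of flow samples per flow datagram from the MTU is a greedy grouping, sketched below. The function name `group_samples` and the overhead figure are assumptions: the example budgets 20 bytes for an IPv4 header without options, 8 bytes for a UDP header, and a hypothetical 28-byte flow datagram header.

```python
from typing import List


def group_samples(samples: List[bytes], mtu: int, overhead: int) -> List[List[bytes]]:
    """Greedily group flow samples so each group fits within mtu - overhead bytes."""
    budget = mtu - overhead
    groups: List[List[bytes]] = []
    current: List[bytes] = []
    used = 0
    for sample in samples:
        if len(sample) > budget:
            raise ValueError("flow sample larger than the per-datagram budget")
        if current and used + len(sample) > budget:
            groups.append(current)  # close the full group
            current, used = [], 0
        current.append(sample)
        used += len(sample)
    if current:
        groups.append(current)
    return groups


# 25 flow samples of 144 bytes, 1,500-byte MTU, 56 bytes of assumed overhead.
groups = group_samples([bytes(144)] * 25, mtu=1500, overhead=20 + 8 + 28)
print([len(g) for g in groups])  # [10, 10, 5]
```

With a 1,444-byte budget, ten 144-byte flow samples (1,440 bytes) fit in each flow datagram, so 25 stored samples produce three datagrams.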

In Step 612, the flow datagram constructed in Step 610 is transmitted to a network chip. The flow datagram may be transmitted to the network chip via the same dedicated interface on which the sampled network traffic data unit was received in Step 600, or via a different dedicated interface (which may be coupled to the same or a different network chip).

In Step 614, the flow samples that were included in the flow datagram in Step 610 and sent to the network chip in Step 612 are cleared from the storage of the sampling engine.
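The store, construct, transmit, and clear sequence of Steps 608 through 614 can be sketched as a small buffer object. The class name `SamplingEngineBuffer`, the fixed samples-per-datagram threshold, and the two-field header (sequence number and sample count) are simplifying assumptions for illustration; returning the datagram stands in for transmitting it to a network chip.

```python
import struct
from typing import List, Optional


class SamplingEngineBuffer:
    """Illustrative model of sampling engine storage and datagram construction."""

    HEADER = struct.Struct("!II")  # hypothetical: sequence number, sample count

    def __init__(self, samples_per_datagram: int) -> None:
        self.samples_per_datagram = samples_per_datagram
        self.storage: List[bytes] = []
        self.sequence = 0

    def store(self, flow_sample: bytes) -> Optional[bytes]:
        """Store a flow sample; return a flow datagram once enough are stored."""
        self.storage.append(flow_sample)                 # Step 608
        if len(self.storage) < self.samples_per_datagram:
            return None
        self.sequence += 1
        header = self.HEADER.pack(self.sequence, len(self.storage))
        datagram = header + b"".join(self.storage)       # Step 610
        self.storage.clear()                             # Step 614
        return datagram                                  # Step 612 (to network chip)


engine_buffer = SamplingEngineBuffer(samples_per_datagram=3)
results = [engine_buffer.store(s) for s in (b"a", b"b", b"c")]
```

The first two calls return `None`; the third returns a datagram carrying all three samples, after which the storage is empty again.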

FIG. 7 shows a flowchart describing a method for accelerating network traffic sampling in accordance with one or more embodiments of the invention.

In Step 700, a network chip of a non-accelerated line card is associated with a sampling engine of an accelerated line card. In one or more embodiments of the invention, the network chip of the non-accelerated line card is associated with the sampling engine of the accelerated line card based on a load balancing policy such as, for example, a round robin policy, a least connections policy, a policy based on physical proximity of the accelerated and non-accelerated line card, any combination of load balancing policies, etc. Such a load balancing policy may be designed to balance the workload related to network traffic flow sampling among the sampling engines in a network device.

In Step 702, the network chip of the non-accelerated line card is configured with a virtual output queue (VOQ) that is associated with an internal interface of the network chip of the non-accelerated line card. In one or more embodiments of the invention, the internal interface associated with the configured VOQ is operatively connected to a network chip on the same accelerated line card as the sampling engine.

In Step 704, a network traffic data unit is received at the network chip of the non-accelerated line card. In one or more embodiments of the invention, the network traffic data unit is received from a device external to the network device at a physical network interface of the network device, and propagated to an external interface of the network chip of the non-accelerated line card.

In Step 706, the network traffic data unit received by the network chip of the non-accelerated line card in Step 704 is selected by the network chip to be sampled. In one or more embodiments of the invention, the selection of the network traffic data unit is based on a traffic sampling rate configured for the network chip of the non-accelerated line card.

In Step 708, the network chip of the non-accelerated line card adds a sampling header to the network traffic data unit to obtain a sampled network traffic data unit. Although not shown in FIG. 7, the sampling header, in some embodiments of the invention, is instead prepended or appended to the network traffic data unit by a network chip of the accelerated line card with the associated sampling engine based on internal headers that are added to the network traffic data unit by the network chip of the non-accelerated line card before the network traffic data unit is transmitted via the VOQ to the network chip of the accelerated line card.

In Step 710, the sampled network traffic data unit is transmitted from the network chip of the non-accelerated line card to a network chip of an accelerated line card that includes the associated sampling engine. In one or more embodiments of the invention, the sampled network traffic data unit is transmitted to the network chip of the accelerated line card using the VOQ configured in Step 702.

FIG. 8 shows a flowchart describing a method for accelerating network traffic sampling in accordance with one or more embodiments of the invention.

In Step 800, a network device is configured with both accelerated line cards and non-accelerated line cards. In one or more embodiments of the invention, configuring a network device with a line card includes installing a line card into the network device in an appropriate line card receiving space. Any number of accelerated line cards and non-accelerated line cards may be inserted into a network device without departing from the scope of the invention.

In Step 802, the network chips of the accelerated and non-accelerated line cards are load balanced among the sampling engines of the accelerated line cards. For example, three accelerated line cards, each with one sampling engine and two network chips, and three non-accelerated line cards, each with two network chips, may be inserted into a network device. In such an example, there are twelve total network chips and three total sampling engines included in the network device. One non-limiting example of a way to load balance the network chips among the sampling engines in such a scenario may include associating each network chip on an accelerated line card with the sampling engine of the same accelerated line card, leaving six network chips on non-accelerated line cards to be load balanced. Two network chips of the non-accelerated line cards may then be associated with each of the three sampling engines of the accelerated line cards, so that each of the three sampling engines is associated with four network chips installed in the network device.
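The twelve-chip example above can be reproduced with a short sketch. It associates the network chips of each accelerated line card with that card's own sampling engine, then assigns the remaining chips using a least-loaded rule (one possible reading of a least connections policy). The function name `balance`, the card and chip names, and the tie-breaking by name are assumptions for the example.

```python
from collections import Counter
from typing import Dict, List


def balance(accelerated: Dict[str, List[str]],
            non_accelerated: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each network chip to a sampling engine, named by its accelerated card."""
    assignment: Dict[str, str] = {}
    load: Dict[str, int] = {card: 0 for card in accelerated}
    # Chips on an accelerated line card use that card's own sampling engine.
    for card, chips in accelerated.items():
        for chip in chips:
            assignment[chip] = card
            load[card] += 1
    # Remaining chips go to the least-loaded engine (ties broken by name).
    for chips in non_accelerated.values():
        for chip in chips:
            target = min(load, key=lambda card: (load[card], card))
            assignment[chip] = target
            load[target] += 1
    return assignment


accelerated = {f"A{i}": [f"A{i}-chip0", f"A{i}-chip1"] for i in (1, 2, 3)}
non_accelerated = {f"N{i}": [f"N{i}-chip0", f"N{i}-chip1"] for i in (1, 2, 3)}
assignment = balance(accelerated, non_accelerated)
print(Counter(assignment.values()))
```

Each of the three sampling engines ends up associated with four of the twelve network chips, matching the example in Step 802.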

In one or more embodiments of the invention, during load balancing, whether an interface of a network device is included in a port-channel may be accounted for. For example, if a source interface is a member of a port-channel, then flow datagrams may set the ingress interface to identify the port-channel rather than individual member interfaces. In such an example, if member interfaces are spread across different network device chips, the network device chips may ordinarily send sampled network traffic data units to different sampling engines in the system, which may result, for example, in sFlow sequence number and sample pool calculations being incorrect. For example, a sequence number for an ingress interface should increase monotonically. Accordingly, in one or more embodiments of the invention, the network device includes functionality to send all sampled network traffic data units for all member interfaces of a particular port-channel to the same sampling engine instead of causing those interfaces to comply with an implemented load balancing scheme. Said another way, all interfaces on a network device chip which has a member interface of a port-channel may not necessarily send all sampled network traffic data units to the same sampling engine. Thus, in one or more embodiments of the invention, port-channel load-balancing is calculated independently.
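One possible way to picture the port-channel handling is as a lookup that checks port-channel membership before falling back to the per-chip association. The following Python sketch assumes hypothetical interface, chip, and engine names and hypothetical lookup tables; the description does not specify how the association is stored.

```python
# Hypothetical tables; every name below is illustrative only.
CHIP_OF_INTERFACE = {
    "Ethernet1": "chip-A",
    "Ethernet2": "chip-C",
    "Ethernet3": "chip-C",
}
ENGINE_OF_CHIP = {"chip-A": "engine-1", "chip-C": "engine-2"}

# Ethernet1 and Ethernet2 are members of the same port-channel even though
# they sit on different network chips.
PORT_CHANNEL_OF_INTERFACE = {
    "Ethernet1": "Port-Channel10",
    "Ethernet2": "Port-Channel10",
}
# Port-channels are balanced independently of the per-chip scheme.
ENGINE_OF_PORT_CHANNEL = {"Port-Channel10": "engine-1"}


def engine_for(interface):
    """Pick the sampling engine for samples taken on the given interface."""
    port_channel = PORT_CHANNEL_OF_INTERFACE.get(interface)
    if port_channel is not None:
        # All members of one port-channel share one engine, so that the
        # per-port-channel sequence number is maintained in one place.
        return ENGINE_OF_PORT_CHANNEL[port_channel]
    # Non-member interfaces follow the chip's load-balanced association.
    return ENGINE_OF_CHIP[CHIP_OF_INTERFACE[interface]]
```

In this sketch, Ethernet2 and Ethernet3 are on the same chip yet map to different sampling engines, which mirrors the statement that not all interfaces of a chip with a port-channel member necessarily use the same sampling engine.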

In Step 804, a determination is made about whether a non-accelerated line card has been removed from the network device. In one or more embodiments of the invention, if no non-accelerated line card has been removed from the network device, the process proceeds to Step 806. In one or more embodiments of the invention, if a non-accelerated line card has been removed from the network device, the process proceeds to Step 808.

Turning to Step 806, a determination is made about whether an accelerated line card has been removed from a network device. In one or more embodiments of the invention, if no accelerated line card has been removed from the network device, the process proceeds to Step 810. If, on the other hand, an accelerated line card has been removed from the network device, the process proceeds to Step 808.

In Step 808, all network chips of the line card determined to have been removed in Step 804 or Step 806 are removed from a list of network chips installed in the network device (e.g., a list maintained in a system database), and the remaining network chips are rebalanced among the remaining sampling engines installed in the network device. If the removed line card was a non-accelerated line card, all network chips of the removed non-accelerated line card are removed from a list of installed network chips, and a re-balancing of remaining network chips among the installed sampling engines is triggered. For example, in the scenario where there are twelve network chips (two on each of three accelerated line cards and two on each of three non-accelerated line cards) distributed evenly among three sampling engines, removing one non-accelerated line card and the two network chips of the non-accelerated line card from the network device leaves ten network chips in the network device, four of which are in non-accelerated line cards. In such an example, based on a load balancing policy, the four network chips of the two remaining non-accelerated line cards may be distributed such that two of the sampling engines are each associated with three of the ten remaining network chips, with one sampling engine being associated with the other four remaining network chips. Similarly, if an accelerated line card is removed, then the remaining network chips may be redistributed among the remaining sampling engines based on a load balancing policy. After the rebalancing occurs, the process ends.

Turning to Step 810, a determination is made about whether an accelerated line card has been added to a network device. In one or more embodiments of the invention, if an accelerated line card has been added to a network device, the process proceeds to Step 814. If, on the other hand, no accelerated line card is added to the network device, the process proceeds to Step 812.

In Step 812, a determination is made about whether a non-accelerated line card has been added to a network device. If no non-accelerated line card is added to the network device, the process ends. If, on the other hand, a non-accelerated line card is added to the network device, the process proceeds to Step 814.

In Step 814, a re-balancing is triggered that distributes the network chips of the network device, including the network chips determined to be added in either Step 810 or Step 812, among the sampling engines of the network device according to a load balancing policy. In one or more embodiments of the invention, if an accelerated line card was determined to have been added to the network device, the re-balancing redistributes the network chips accounting for the addition of more sampling engines. In one or more embodiments of the invention, if a non-accelerated line card is determined to have been added, then the re-balancing distributes the existing and added network chips among the already installed sampling engines of the network device.
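The order of the determinations in Steps 804 through 814 can be summarized as a short decision function. The Python sketch below is only a reading aid: the function name `fig8_path` and its boolean parameters are hypothetical, and the returned list simply records which step numbers of FIG. 8 would be visited.

```python
def fig8_path(non_accel_removed=False, accel_removed=False,
              accel_added=False, non_accel_added=False):
    """Return the step numbers of FIG. 8 visited after Step 802."""
    path = [804]
    if non_accel_removed:
        return path + [808]   # drop the card's chips, then rebalance
    path.append(806)
    if accel_removed:
        return path + [808]   # fewer chips and fewer sampling engines
    path.append(810)
    if accel_added:
        return path + [814]   # rebalance across more sampling engines
    path.append(812)
    if non_accel_added:
        path.append(814)      # rebalance across the existing engines
    return path               # without 814 the process simply ends
```

For instance, `fig8_path(accel_removed=True)` visits Steps 804, 806, and 808, while a call with no arguments passes through all four determinations and ends without a re-balancing.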

FIG. 9A, FIG. 9B, FIG. 9C, and FIG. 9D show an example in accordance with one or more embodiments of the invention. The following example is for explanatory purposes only and not intended to limit the scope of the invention.

Referring to FIG. 9A, consider a scenario in which a network device (900) includes two accelerated line cards (902, 916) and two non-accelerated line cards (910, 924). Accelerated line card A (902) includes network chip A (904), network chip B (906), and sampling engine A (908). Accelerated line card B (916) includes network chip E (918), network chip F (920), and sampling engine B (922). Non-accelerated line card A (910) includes network chip C (912) and network chip D (914). Non-accelerated line card B (924) includes network chip G (926) and network chip H (928). Each network chip is configured with a sampling rate under which one of every 1000 network traffic data units received by the network chip is mirrored and the copy is sent to a sampling engine.

In such a scenario, when network chip A (904) receives a network traffic data unit, and the sampling rate dictates that the network traffic data unit is to be sampled, then network chip A mirrors the network traffic data unit and prepends a sampling header to the copy of the network traffic data unit. The sampled network traffic data unit is then sent to sampling engine A (908). Sampling engine A (908) then processes the sampled network traffic data unit to obtain the sampling information and the network traffic data unit. Sampling engine A (908) then truncates the network traffic data unit, prepends a flow sample header, and stores the result as a flow sample. Once a certain number of flow samples have been stored (e.g., based on an MTU size for a path to a collector), sampling engine A obtains the stored flow samples, aggregates them, and prepends a flow datagram header to obtain a flow datagram. The flow datagram is then transmitted to network chip A (904), where it is processed to determine the one or more collectors to which the flow datagram is to be transmitted. Based on the aforementioned determination, a quantity of flow datagrams is generated to match the number of collectors, where the generated flow datagrams include appropriate MAC and IP header information to facilitate transmission of the flow datagrams towards the various collectors.
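The MTU-based threshold and the store, aggregate, and clear behavior of the sampling engine can be illustrated with a small Python sketch. All sizes here are assumptions chosen for the arithmetic (a fixed 192-byte flow sample and a 28-byte flow datagram header) and are not taken from the description or from the sFlow wire format; the names `samples_per_datagram` and `FlowSampleStore` are hypothetical. The only fixed facts used are the 20-byte IPv4 header without options and the 8-byte UDP header.

```python
IPV4_UDP_OVERHEAD = 20 + 8  # IPv4 header without options plus UDP header


def samples_per_datagram(path_mtu, flow_sample_size, datagram_header_size,
                         encap_overhead=IPV4_UDP_OVERHEAD):
    """How many fixed-size flow samples fit in one datagram for this MTU."""
    budget = path_mtu - encap_overhead - datagram_header_size
    if flow_sample_size <= 0 or budget < flow_sample_size:
        raise ValueError("MTU too small for a single flow sample")
    return budget // flow_sample_size


class FlowSampleStore:
    """Stand-in for the sampling engine's storage of flow samples."""

    def __init__(self, threshold, datagram_header):
        self.threshold = threshold
        self.datagram_header = datagram_header
        self.samples = []

    def store(self, flow_sample):
        """Store one flow sample; return a flow datagram once enough exist."""
        self.samples.append(flow_sample)
        if len(self.samples) < self.threshold:
            return None
        # Aggregate the stored samples and prepend the flow datagram header.
        datagram = self.datagram_header + b"".join(self.samples)
        self.samples.clear()  # stored samples are cleared after use
        return datagram


# Assumed sizes: 1500-byte path MTU, 192-byte samples, 28-byte header.
threshold = samples_per_datagram(1500, 192, 28)
store = FlowSampleStore(threshold, datagram_header=b"H" * 28)
results = [store.store(b"S" * 192) for _ in range(threshold)]
datagram = results[-1]
```

Under these assumed sizes, 1500 - 28 - 28 = 1444 bytes remain for flow samples, so seven samples are aggregated into each flow datagram, and the store is empty again once the datagram has been built.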

Additionally, in the aforementioned scenario, a load balancing policy has caused network chip A (904), network chip B (906), network chip C (912), and network chip D (914) to be associated with sampling engine A (908), while network chip E (918), network chip F (920), network chip G (926), and network chip H (928) are associated with sampling engine B (922). Thus, each of the two sampling engines is associated with four network chips, for a balanced distribution of network chips among the sampling engines.

When a network traffic data unit is received by network chip G (926) and, according to a sampling rate configured for network chip G, is to be sampled, then network chip G mirrors the network traffic data unit and prepends a sampling header to the network traffic data unit to obtain a sampled network traffic data unit. The sampled network traffic data unit is then transmitted, via a VOQ of network chip G (926), to network chip E (918) of accelerated line card B (916) based on the association between network chip G (926) and sampling engine B (922). Network chip E, in turn, propagates the sampled network traffic data unit to sampling engine B, which processes the sampled network traffic data unit, creates a flow sample, and stores the flow sample with other flow samples. Next, the stored flow samples are aggregated into a flow datagram, a flow datagram header is prepended, and the flow datagram is sent to either network chip E (918) or network chip F (920) for further processing (e.g., prepending additional headers) and propagated towards one or more collectors.

Continuing the example, in FIG. 9B accelerated line card B (916) is removed from the network device (900), triggering a rebalancing of the remaining network chips among the remaining sampling engines. In this scenario, because the network device had only two installed sampling engines, one of which has now been removed, the six remaining network chips (904, 906, 912, 914, 926, and 928) are each assigned to sampling engine A (908).

In FIG. 9C, accelerated line card B (916) is added back to the network device. This addition again triggers a re-balancing in which four network chips are again associated with each of the two sampling engines.

In FIG. 9D, non-accelerated line card A (910) is removed from the network device. This again causes a re-balancing of the network chips. In the scenario shown in FIG. 9D, there are now two sampling engines and six network chips. Thus, according to a load balancing policy that strives for equitable distribution of network chips among sampling engines, three network chips (904, 906, and 926) are assigned to sampling engine A (908), while the other three remaining network chips (918, 920, and 928) are associated with sampling engine B (922).

One or more embodiments of the invention may facilitate acceleration of network traffic sampling (i.e., flow sampling) by offloading at least a portion of sampling activities from one or more processors of a network device to one or more sampling engines, which may improve performance of the network device.

While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.

What is claimed is:
1. A method of accelerating monitoring of network traffic, the method comprising: receiving, at a network chip of an accelerated line card of a network device, a network traffic data unit, wherein the network chip comprises first circuitry; capturing, by the network chip, the network traffic data unit based on a traffic sampling rate; adding, by the network chip, a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; sending the sampled network traffic data unit from the network chip to a hardware sampling engine on the accelerated line card, wherein the hardware sampling engine comprises second circuitry and is configured to: truncate the sampled network traffic data unit to obtain a network traffic data unit portion, and construct a flow datagram comprising the network traffic data unit portion and a flow datagram header; receiving, at the network chip, from the hardware sampling engine, the flow datagram comprising the network traffic data unit portion and the flow datagram header; generating, by the network chip, a flow network data traffic unit comprising the flow datagram and a destination address of a collector; and transmitting, by the network chip, the flow network data traffic unit towards the collector.
2. The method of claim 1, further comprising, before receiving the network traffic data unit, configuring the traffic sampling rate for the network chip.
3. The method of claim 1, wherein the sampling header specifies one selected from a group consisting of: an ingress interface associated with the network traffic data unit, an egress interface associated with the network traffic data unit, an ingress virtual local area network (VLAN) associated with the network traffic data unit, an egress VLAN associated with the network traffic data unit, a next-hop associated with the network traffic data unit, and a reverse path based on a source lookup associated with the network traffic data unit.
4. The method of claim 3, wherein the egress interface may be set to identify one selected from a group consisting of a regular network traffic data unit, a multicast network traffic data unit, a network traffic data unit being sent to a processor of the network device, and a dropped network traffic data unit.
5. The method of claim 1, wherein the flow datagram comprises: the flow datagram header comprising information associated with the collector; and a flow sample comprising the network traffic data unit portion.
6. A non-transitory computer readable medium comprising instructions that, when executed by a processor, configure a network device to perform a method of accelerating monitoring of network traffic, the method comprising: receiving, at a network chip of an accelerated line card of the network device, a network traffic data unit, wherein the network chip comprises first circuitry; capturing, by the network chip, the network traffic data unit based on a traffic sampling rate; adding, by the network chip, a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; sending the sampled network traffic data unit from the network chip to a sampling engine on the accelerated line card, wherein the sampling engine comprises second circuitry and is configured to: truncate the sampled network traffic data unit to obtain a network traffic data unit portion, and construct a flow datagram comprising a network traffic data unit portion and a flow datagram header; receiving, at the network chip, from the sampling engine, the flow datagram comprising the network traffic data unit portion and the flow datagram header; generating, by the network chip, a flow network data traffic unit comprising the flow datagram and a destination address of a collector; and transmitting, by the network chip, the flow network data traffic unit towards the collector.
7. The non-transitory computer readable medium of claim 6, wherein the method implemented by the instructions further comprises, before receiving the network traffic data unit, configuring the traffic sampling rate for the network chip.
8. The non-transitory computer readable medium of claim 6, wherein the sampling header specifies at least an ingress interface associated with the network traffic data unit and an egress interface associated with the network traffic data unit.
9. The non-transitory computer readable medium of claim 6, wherein the flow datagram comprises: the flow datagram header comprising information associated with the collector; and a flow sample comprising the network traffic data unit portion.
10. A method of accelerating monitoring of network traffic, the method comprising: receiving, at a hardware sampling engine of an accelerated line card and from a network chip of the accelerated line card, a sampled network traffic data unit generated by the network chip and comprising a sampling header and a network traffic data unit, wherein the hardware sampling engine comprises first circuitry and the network chip comprises second circuitry; processing, by the hardware sampling engine, the sampled network traffic data unit to obtain sample information; truncating, by the hardware sampling engine, the network traffic data unit to obtain a network traffic data unit portion; generating, by the hardware sampling engine, a flow sample header comprising the sample information; storing, by the hardware sampling engine, in storage of the hardware sampling engine, a flow sample comprising the flow sample header and the network traffic data unit portion; constructing, by the hardware sampling engine, a flow datagram comprising the flow sample and a plurality of other flow samples; sending, by the hardware sampling engine, the flow datagram to the network chip, wherein the network chip: generates a flow network traffic data unit comprising the flow datagram and a destination address of a collector, and transmits the flow network traffic data unit towards the collector; and clearing, by the hardware sampling engine, the flow sample and the plurality of other flow samples from the storage of the hardware sampling engine.
11. The method of claim 10, wherein constructing the flow datagram comprises: obtaining, by the hardware sampling engine, the flow sample and the plurality of other flow samples from the storage of the hardware sampling engine, wherein the flow datagram further comprises a flow datagram header, and the flow datagram header comprises collector information associated with a collector.
12. The method of claim 11, wherein a quantity of flow samples to be included in the flow datagram is determined by the hardware sampling engine based on a maximum transmission unit (MTU) associated with a path to the collector.
13. The method of claim 10, further comprising, before truncating the network traffic data unit, further processing the network traffic data unit to obtain a size of the network traffic data unit, wherein the flow sample header further comprises the size of the network traffic data unit.
14. A non-transitory computer readable medium comprising instructions that, when executed by a processor, configure a network device to perform a method of accelerating monitoring of network traffic, the method comprising: receiving, at a hardware sampling engine of an accelerated line card and from a network chip of the accelerated line card, a sampled network traffic data unit generated by the network chip and comprising a sampling header and a network traffic data unit, wherein the hardware sampling engine comprises first circuitry and the network chip comprises second circuitry; processing, by the hardware sampling engine, the sampled network traffic data unit to obtain sample information; truncating, by the hardware sampling engine, the network traffic data unit to obtain a network traffic data unit portion; generating, by the hardware sampling engine, a flow sample header comprising the sample information; storing, by the hardware sampling engine, in storage of the hardware sampling engine, a flow sample comprising the flow sample header and the network traffic data unit portion; constructing, by the hardware sampling engine, a flow datagram comprising the flow sample and a plurality of other flow samples; sending, by the hardware sampling engine, the flow datagram to the network chip, wherein the network chip: generates a flow network traffic data unit comprising the flow datagram and a destination address of a collector, and transmits the flow network traffic data unit towards the collector; and clearing, by the hardware sampling engine, the flow sample and the plurality of other flow samples from the storage of the hardware sampling engine.
15. The non-transitory computer readable medium of claim 14, wherein constructing the flow datagram comprises: obtaining the flow sample and the plurality of other flow samples from the storage of the hardware sampling engine, wherein the flow datagram further comprises a flow datagram header, and the flow datagram header comprises collector information associated with a collector.
16. The non-transitory computer readable medium of claim 15, wherein a quantity of flow samples to be included in the flow datagram is determined by the hardware sampling engine based on a maximum transmission unit (MTU) associated with a path to the collector.
17. The non-transitory computer readable medium of claim 14, wherein the method implemented by the instructions further comprises, before truncating the network traffic data unit, further processing the network traffic data unit to obtain a size of the network traffic data unit, wherein the flow sample header further comprises the size of the network traffic data unit.
18. A system for accelerating monitoring of network traffic, the system comprising: a network chip comprising first circuitry, on an accelerated line card, and configured to: receive a network traffic data unit; select the network traffic data unit based on a traffic sampling rate; add a sampling header to the network traffic data unit to obtain a sampled network traffic data unit; send the sampled network traffic data unit from the network chip to a sampling engine; receive, from the sampling engine, a flow datagram comprising a network traffic data unit portion; generate a flow network data traffic unit comprising the flow datagram and a destination address of a collector; and send the flow network data traffic unit to the collector; and the sampling engine comprising second circuitry, on the accelerated line card, operatively connected to the network chip, comprising storage, and configured to: receive, at the sampling engine and from the network chip, the sampled network traffic data unit comprising the sampling header and the network traffic data unit; process the sampled network traffic data unit to obtain sample information and the network traffic data unit; truncate the network traffic data unit to obtain a network traffic data unit portion; generate a flow sample header comprising the sample information; store, in the storage of the sampling engine, a flow sample comprising the flow sample header and the network traffic data unit portion; construct the flow datagram comprising the flow sample and a plurality of other flow samples; send the flow datagram to the network chip; and clear the flow sample and the plurality of other flow samples from the storage of the sampling engine.
19. The system of claim 18, wherein the sampling engine is a field programmable gate array (FPGA).
20. The system of claim 18, wherein the sampling engine is operatively connected to the network chip via a dedicated interface of the network chip.